Incremental Weighted Naive Bayes Classifiers for Data Stream

Authors

  • Christophe Salperwyck
  • Vincent Lemaire
  • Carine Hue
Abstract

A naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with a naive independence assumption: the explanatory variables (Xi) are assumed to be conditionally independent given the target variable (Y). Despite this strong assumption, this classifier has proved very effective on many real applications and is often used for supervised classification on data streams. The naive Bayes classifier simply relies on the estimation of the univariate conditional probabilities P(Xi|Y), which can be provided on a data stream using a "supervised quantiles summary". The literature shows that the naive Bayes classifier can be improved (i) by using a variable selection method or (ii) by weighting the explanatory variables. Most of these methods are designed for batch (off-line) learning: they need to store all the data in memory and/or require reading each example more than once, and therefore cannot be used on data streams. This paper presents a new method, based on a graphical model, which computes the weights on the input variables using a stochastic estimation. The method is incremental and produces a weighted naive Bayes classifier for data streams. It is compared with the classical naive Bayes classifier on the Large Scale Learning challenge datasets.
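
As a rough illustration only (not the paper's algorithm), the sketch below shows one way a weighted naive Bayes can be trained incrementally: the per-variable weights enter as exponents on the univariate conditional likelihoods, and are updated by one stochastic gradient step on the log loss per example. The class IncrementalWeightedNB, the Gaussian class-conditional estimates (standing in for the "supervised quantiles summary"), and the learning rate lr are assumptions made for the example.

    # Illustrative sketch, not the paper's method: weighted naive Bayes with
    #     P(Y=c | X) proportional to P(Y=c) * prod_i P(X_i | Y=c) ** w_i,
    # where the weights w_i are fitted online by stochastic gradient descent
    # on the log loss. Gaussian class-conditional densities are an assumption
    # standing in for the supervised quantiles summary.
    import math
    from collections import defaultdict

    class IncrementalWeightedNB:
        def __init__(self, n_features, lr=0.01):
            self.n = n_features
            self.lr = lr                     # step size of the stochastic updates
            self.w = [1.0] * n_features      # one weight per explanatory variable
            self.class_count = defaultdict(int)
            # running mean / spread per (class, feature), Welford's algorithm
            self.mean = defaultdict(lambda: [0.0] * n_features)
            self.m2 = defaultdict(lambda: [1e-3] * n_features)

        def _log_density(self, c, x):
            """Per-feature Gaussian log-likelihoods log P(X_i | Y=c)."""
            n = max(self.class_count[c], 1)
            out = []
            for i in range(self.n):
                var = self.m2[c][i] / n + 1e-6
                out.append(-0.5 * (math.log(2 * math.pi * var)
                                   + (x[i] - self.mean[c][i]) ** 2 / var))
            return out

        def predict_proba(self, x):
            """Weighted naive Bayes posterior over the classes seen so far."""
            total = sum(self.class_count.values())
            scores = {}
            for c in self.class_count:
                log_prior = math.log(self.class_count[c] / total)
                scores[c] = log_prior + sum(w * l for w, l
                                            in zip(self.w, self._log_density(c, x)))
            z = max(scores.values())
            exp = {c: math.exp(s - z) for c, s in scores.items()}
            norm = sum(exp.values())
            return {c: v / norm for c, v in exp.items()}

        def learn_one(self, x, y):
            """Update the density summaries, then take one stochastic step on w."""
            if len(self.class_count) > 1:      # gradient of the log loss w.r.t. w
                proba = self.predict_proba(x)
                logd = {c: self._log_density(c, x) for c in proba}
                for i in range(self.n):
                    expected = sum(p * logd[c][i] for c, p in proba.items())
                    grad = expected - logd[y][i] if y in logd else 0.0
                    self.w[i] -= self.lr * grad
            # incremental per-class Gaussian statistics (Welford update)
            self.class_count[y] += 1
            k = self.class_count[y]
            for i in range(self.n):
                d = x[i] - self.mean[y][i]
                self.mean[y][i] += d / k
                self.m2[y][i] += d * (x[i] - self.mean[y][i])

Each example from the stream is read once and memory is bounded by the per-class summaries; setting every weight to 1 recovers the classical (unweighted) naive Bayes prediction.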

Similar articles

Incremental Learning of Tree Augmented Naive Bayes Classifiers

Machine learning has focused a lot of attention on Bayesian classifiers in recent years. It has been shown that, even though the Naive Bayes classifier performs well in many cases, it may be improved by introducing some dependency relationships among the variables (Augmented Naive Bayes). Naive Bayes is incremental in nature but, up to now, there are no incremental algorithms for learning Augmented classifiers. When...

Dynamic Weighted Majority for Incremental Learning of Imbalanced Data Streams with Concept Drift

Concept drifts occurring in data streams will jeopardize the accuracy and stability of the online learning process. If the data stream is imbalanced, it will be even more challenging to detect and cure the concept drift. In the literature, these two problems have been intensively addressed separately, but have yet to be well studied when they occur together. In this paper, we propose a chunk-ba...

Incremental Augmented Naive Bayes Classifiers

We propose two general heuristics to transform a batch hill-climbing search into an incremental one. When new data are available, our heuristics study the search path to determine whether it is worth revising the current structure and, if it is, which part of the structure must be revised. Then, we apply our heuristics to two Bayesian network structure learning algorithms in order to...

Mining Concept-Drifting Data Streams

Knowledge discovery from infinite data streams is an important and difficult task. We are facing two challenges: the overwhelming volume and the concept drifts of the streaming data. In this chapter, we introduce a general framework for mining concept-drifting data streams using weighted ensemble classifiers. We train an ensemble of classification models, such as C4.5, RIPPER, naive Bayesian, et...
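
As a rough illustration of the chunk-based weighted-ensemble idea summarized above (not the chapter's exact method), the snippet below trains one base classifier per data chunk, re-weights the existing members by their accuracy on the newest chunk, and predicts by weighted vote. scikit-learn's DecisionTreeClassifier stands in for C4.5, and the class and parameter names are invented for the example.

    # Sketch of a chunk-based weighted ensemble for drifting streams: models
    # fitted on outdated concepts receive low weights on recent chunks and
    # eventually fall out of the ensemble.
    from collections import Counter
    from sklearn.tree import DecisionTreeClassifier   # stands in for C4.5

    class ChunkWeightedEnsemble:
        def __init__(self, max_members=10):
            self.max_members = max_members
            self.members = []                 # list of (classifier, weight)

        def partial_fit_chunk(self, X, y):
            # re-weight existing members by their accuracy on the newest chunk
            self.members = [(clf, max(clf.score(X, y), 1e-3))
                            for clf, _ in self.members]
            # train a fresh classifier on the chunk and add it with full weight
            clf = DecisionTreeClassifier().fit(X, y)
            self.members.append((clf, 1.0))
            # keep only the best-weighted members
            self.members = sorted(self.members, key=lambda m: m[1],
                                  reverse=True)[: self.max_members]

        def predict(self, X):
            # weighted majority vote of the surviving members
            votes = [Counter() for _ in range(len(X))]
            for clf, w in self.members:
                for i, label in enumerate(clf.predict(X)):
                    votes[i][label] += w
            return [v.most_common(1)[0][0] for v in votes]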


Journal:

Volume   Issue

Pages  -

Publication date: 2013